STRONG MEMORYLESS TIMES AND RARE EVENTS IN
MARKOV RENEWAL POINT PROCESSES 

By Torkel Erhardsson 

Royal Institute of Technology 

Let W be the number of points in (0,t] of a stationary finite- 
state Markov renewal point process. We derive a bound for the total 
variation distance between the distribution of W and a compound 
Poisson distribution. For any nonnegative random variable $\zeta$, we
construct a "strong memoryless time" $\hat\zeta$ such that $\zeta - t$ is exponentially
distributed conditional on $\{\hat\zeta \le t, \zeta > t\}$, for each $t$. This is used to
embed the Markov renewal point process into another such process 
whose state space contains a frequently observed state which repre- 
sents loss of memory in the original process. We then write W as the 
accumulated reward of an embedded renewal reward process, and use 
a compound Poisson approximation error bound for this quantity by 
Erhardsson. For a renewal process, the bound depends in a simple 
way on the first two moments of the interrenewal time distribution, 
and on two constants obtained from the Radon-Nikodym derivative 
of the interrenewal time distribution with respect to an exponential 
distribution. For a Poisson process, the bound is 0. 

1. Introduction. In this paper, we are concerned with rare events in 
stationary finite-state Markov renewal point processes (MRPPs). An MRPP 
is a marked point process on $\mathbb{R}$ or $\mathbb{Z}$ (continuous or discrete time). Each
point of an MRPP has an associated mark, or state. The distance in time 
between two successive points and the state of the second point are jointly 
conditionally independent of the past given the state of the first point. A 
renewal process is a special case of an MRPP, and any finite-state Markov 
or semi-Markov process can be constructed using a suitable MRPP, simply 
by defining the state of the process at time t to be the state of the most 
recently observed point of the MRPP. 
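
As a minimal illustrative sketch (not taken from the paper), the definition above can be turned into a short simulation. The function name `simulate_mrpp`, the transition matrix `P` and the state-dependent exponential rates are hypothetical choices; here the distance to the next point depends only on the state of the current point, which is one special case of the general definition, and the process is started from a point at time 0 rather than from its stationary version.

```python
import random

def simulate_mrpp(P, rates, start_state, horizon, rng):
    """Return the points (time, state) in (0, horizon] of a simple MRPP.

    P[s][s2] is the probability that a point with state s is followed by a
    point with state s2; rates[s] is the (assumed) exponential rate of the
    distance to the next point, given that the current point has state s.
    """
    points = []
    t, s = 0.0, start_state              # a point with state start_state at time 0
    while True:
        t += rng.expovariate(rates[s])   # distance to the next point
        if t > horizon:
            return points
        s = rng.choices(range(len(P)), weights=P[s])[0]   # state of the next point
        points.append((t, s))

rng = random.Random(0)
P = [[0.9, 0.1], [0.5, 0.5]]
pts = simulate_mrpp(P, rates=[1.0, 2.0], start_state=0, horizon=50.0, rng=rng)
count_B = sum(1 for _, s in pts if s == 1)   # points with state in B = {1}
```

The state of the associated semi-Markov process at time t is then the state of the last entry of `pts` with time at most t.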

The number of points of a stationary MRPP in (0,t] with states in a
certain subset B of the state space is an important quantity in many ap- 
plications. For example, the number of visits to B in (0,t] by a stationary 
Markov chain can be expressed in this way. If points with states in B are 
rare, this quantity should be approximately compound Poisson distributed. 
Heuristically, the set of such points can be partitioned into disjoint clumps, 
the sizes of which are approximately i.i.d., and the number of which are ap- 
proximately Poisson distributed. For a further discussion, see Aldous (1989). 

In this paper, the main result is an upper bound for the total variation dis- 
tance between the distribution of this quantity and a particular compound 
Poisson distribution. The bound can be expressed in terms of the first two 
moments of the interrenewal time conditional distributions, and on two con- 
stants obtained from each Radon-Nikodym derivative of an interrenewal 
time conditional distribution with respect to an exponential distribution, by 
solving a small number of systems of linear equations of dimension at most 
the total number of states. This is explicit often enough to be of considerable 
interest. 

We briefly describe the ideas in the proof. If a single state $a \in B^c$ is chosen,
we can construct a bound of the desired kind by expressing the quantity of 
interest as the accumulated reward of an embedded renewal reward process, 
for which the points with state a serve as renewals. We then use Theorem 
5.1 in Erhardsson (2000b) which gives a compound Poisson approximation 
error bound for the accumulated reward. However, the bound is small only if 
points with state a are frequently observed. For many Markov chains, there 
exists a frequently observed state a [see Erhardsson (1999, 2000a, 2001a, b)], 
but in many other cases no such a exists. 

To solve this problem, we study the pair of random variables $(\zeta, V)$, where
$\zeta$ is the distance between two successive points and $V$ is the state of the
second point. We construct a probability space containing $(\zeta, V)$ and a third
random variable $\hat\zeta$ such that, for all $t$, conditional on $\{\hat\zeta \le t, \zeta > t\}$, the
pair $(\zeta - t, V)$ has the distribution $\nu_\gamma \times \mu$, where $\nu_\gamma$ is an exponential (or
geometric) distribution with mean $\gamma^{-1}$, and $\mu$ is a fixed distribution. One
might say that the event $\{\hat\zeta \le t, \zeta > t\}$ indicates a loss of memory at or
before $t$. For this reason, we call $\hat\zeta$ a "strong memoryless time."

Using strong memoryless times, we embed the stationary MRPP into 
another stationary MRPP whose state space contains an additional state 0. 
The points with states different from 0 also belong to the original MRPP.
The points with state 0 represent losses of memory in the original MRPP,
and are frequently observed if the original MRPP loses its memory quickly
enough. The bound is then derived by an application of Theorem 5.1 in
Erhardsson (2000b) to the accumulated reward of a renewal reward process
embedded into the new MRPP, for which the points with state 0 serve as
renewals. 






In the last section, we compute the bound explicitly for an important 
special case: the number of points in (0,t] of a stationary renewal process in 
continuous time. The bound is 0 if the interrenewal times are exponentially
distributed, that is, if the renewal process is Poisson. We intend to present 
other applications of our results in the future. 

It should be emphasized that the results in this paper are not limit theo- 
rems, but total variation distance error bounds which are valid for all finite 
parameter values. If desired, they can be used to derive limit theorems for 
various kinds of asymptotics, by showing that the bound converges to 0
under these asymptotics. They can also be used to bound the rate of con- 
vergence in limit theorems, by bounding the rate of convergence of the error 
bound. 

It should also be mentioned that the literature contains a number of re- 
sults concerning weak convergence to a compound Poisson point process, 
for special kinds of point processes (e.g., thinned point processes, or point 
processes generated by extreme values). Most of these are pure limit
theorems without error bounds; see, for example, Serfozo (1984) and Leadbetter
and Rootzén (1988). A few error bounds also exist, but they are not intended for
processes of the kind studied in this paper, and are derived using methods very
different from ours; see, for example, Barbour and Månsson (2002).

The rest of the paper is organized as follows. In Section 2, some basic 
notation is given. In Section 3, we give necessary and sufficient conditions for 
the existence of strong memoryless times, and derive some of their relevant
properties. In Section 4, we derive bounds for the total variation distance 
between the distribution of the number of points of an MRPP in (0, t] with 
states in B and a compound Poisson distribution. In Section 5, we consider 
the number of points in (0,t] of a stationary renewal process, and obtain a 
more explicit expression for the bound. 

2. Basic notation. Sets of numbers are denoted as follows: $\mathbb{R}$ = the real
numbers, $\mathbb{Z}$ = the integers, $\mathbb{R}_+ = [0,\infty)$, $\mathbb{R}'_+ = (0,\infty)$,
$\mathbb{Z}_+ = \{0,1,2,\ldots\}$ and $\mathbb{Z}'_+ = \{1,2,\ldots\}$. The distribution of any
random element $X$ in any measurable space $(S,\mathcal{S})$ is denoted by $\mathcal{L}(X)$.
The Borel $\sigma$-algebra of any topological space $S$ is denoted by $\mathcal{B}_S$.

A compound Poisson distribution is a probability distribution with a
characteristic function of the form
$\phi(t) = \exp\big(-\int_{\mathbb{R}'_+} (1 - e^{itx})\,d\pi(x)\big)$, where $\pi$ is a measure
on $(\mathbb{R}'_+, \mathcal{B}_{\mathbb{R}'_+})$ such that $\int_{\mathbb{R}'_+} (1 \wedge x)\,d\pi(x) < \infty$. It is
denoted by POIS($\pi$). If $\|\pi\| = \pi(\mathbb{R}'_+) < \infty$, then
POIS($\pi$) $= \mathcal{L}(\sum_{i=1}^{U} T_i)$, where $\mathcal{L}(T_i) = \pi/\|\pi\|$ for each
$i \in \mathbb{Z}'_+$, $U \sim$ Po($\|\pi\|$), and all random variables are
independent.
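
The representation of POIS($\pi$) as a Poisson number of i.i.d. summands can be sketched in a few lines. This is an illustration only: the helper names `sample_poisson` and `sample_compound_poisson`, and the example measure `pi`, are hypothetical and do not appear in the paper; the measure is assumed to have finite total mass and finite support.

```python
import random

def sample_poisson(mean, rng):
    """Sample Po(mean) by counting unit-rate exponential arrivals in (0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t <= mean:
        n += 1
        t += rng.expovariate(1.0)
    return n

def sample_compound_poisson(pi, rng):
    """Sample from POIS(pi), where pi maps jump sizes to masses (finite total)."""
    total = sum(pi.values())                     # ||pi||
    sizes = list(pi)
    weights = [pi[x] / total for x in sizes]     # L(T_i) = pi / ||pi||
    u = sample_poisson(total, rng)               # U ~ Po(||pi||)
    return sum(rng.choices(sizes, weights=weights, k=u)) if u else 0

rng = random.Random(1)
pi = {1: 0.6, 2: 0.3, 3: 0.1}
draws = [sample_compound_poisson(pi, rng) for _ in range(20000)]
mean_estimate = sum(draws) / len(draws)   # should be near sum_x x * pi(x) = 1.5
```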

The total variation distance is a metric on the space of probability measures
on any measurable space $(S,\mathcal{S})$. It is defined for two such measures
$\nu_1$ and $\nu_2$ by

\[
  d_{TV}(\nu_1,\nu_2) = \sup_{A \in \mathcal{S}} |\nu_1(A) - \nu_2(A)|.
\]
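
For distributions on a countable set the supremum is attained at the event where the first distribution has the larger mass, so the distance equals half the $\ell_1$ distance between the probability mass functions. A minimal sketch, with the hypothetical helper name `total_variation` and distributions passed as dictionaries:

```python
def total_variation(p, q):
    """d_TV between two distributions on a countable set, given as dicts.

    The supremum over events A is attained at A = {x : p(x) > q(x)},
    which gives half the l1 distance between the mass functions.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

p = {0: 0.5, 1: 0.5}
q = {0: 0.25, 1: 0.25, 2: 0.5}
d = total_variation(p, q)   # 0.5 * (0.25 + 0.25 + 0.5)
```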



3. Strong memoryless times. In Theorems 3.1-3.3, we define strong mem- 
oryless times, give necessary and sufficient conditions for their existence, and 
derive some of their relevant properties. Note that Theorem 3.1 holds under 
more general conditions than are needed in Section 4. This will facilitate 
other applications in the future. 

By $\nu_\gamma$ we mean the exponential distribution with mean $\gamma^{-1}$.

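Theorem 3.1. Let $(\zeta,V)$ be a random variable taking values in
$(\mathbb{R}_+ \times S, \mathcal{B}_{\mathbb{R}_+} \times \mathcal{S})$, where $(S,\mathcal{S})$ is a measurable space. Let $\mu$ be a
probability measure on $(S,\mathcal{S})$. Assume that $\sigma : \mathbb{R}_+ \to [0,1]$ satisfies

\[
  \sigma(t) \le \inf_{\substack{C \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S} \\ (\nu_\gamma \times \mu)(C) > 0}}
  \frac{P((\zeta - t, V) \in C)}{(\nu_\gamma \times \mu)(C)} \qquad \forall t \in \mathbb{R}_+, \tag{3.1}
\]

and that $G : \mathbb{R}_+ \to \mathbb{R}_+$, defined by $G(t) = \sigma(t)e^{\gamma t}$, is nondecreasing and
right-continuous. In particular, these conditions are satisfied if equality holds in
(3.1). Then we can define, on the same probability space as $(\zeta,V)$, a nonnegative
random variable $\hat\zeta$ (called a strong memoryless time) such that

\[
  P(\hat\zeta \le t, \zeta \le u, V \in A)
  = P(\zeta \le t \wedge u, V \in A) + \sigma(t)\big(1 - e^{-\gamma[u-t]_+}\big)\mu(A)
  \qquad \forall (t,u,A) \in \mathbb{R}_+ \times \mathbb{R}_+ \times \mathcal{S}, \tag{3.2}
\]

and such that $P(\hat\zeta \le \zeta) = 1$ and
$\mathcal{L}((\zeta - t, V) \mid \hat\zeta \le t, \zeta > t) = \nu_\gamma \times \mu$ for each $t \in \mathbb{R}_+$.
Conversely, assume that the nonnegative random variable $\hat\zeta$, defined
on the same probability space as $(\zeta,V)$, satisfies $P(\hat\zeta \le \zeta) = 1$ and
$\mathcal{L}((\zeta - t, V) \mid \hat\zeta \le t, \zeta > t) = \nu_\gamma \times \mu$ for each $t \in \mathbb{R}_+$.
Then $\sigma : \mathbb{R}_+ \to [0,1]$, defined by $\sigma(t) = P(\hat\zeta \le t, \zeta > t)$, satisfies (3.1),
and $G : \mathbb{R}_+ \to \mathbb{R}_+$, defined by $G(t) = \sigma(t)e^{\gamma t}$, is nondecreasing and
right-continuous.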

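Proof. For notational convenience, extend $\sigma$ to a function $\sigma : \mathbb{R} \to [0,1]$
by defining $\sigma(t) = 0$ for each $t < 0$, and define
$F : \mathbb{R} \times \mathbb{R} \times \mathcal{S} \to [0,1]$ by
$F(t,u,A) = P(\zeta \le t \wedge u, V \in A) + \sigma(t)(1 - e^{-\gamma[u-t]_+})\mu(A)$.
It is easy to see that if we can define a random variable $(\hat\zeta, \zeta, V)$ taking
values in $(\mathbb{R} \times \mathbb{R} \times S, \mathcal{B}_{\mathbb{R}} \times \mathcal{B}_{\mathbb{R}} \times \mathcal{S})$ such that

\[
  P(\hat\zeta \le t, \zeta \le u, V \in A) = F(t,u,A) \qquad \forall (t,u,A) \in \mathbb{R} \times \mathbb{R} \times \mathcal{S},
\]

then $P(\hat\zeta \le \zeta) = 1$ and
$\mathcal{L}((\zeta - t, V) \mid \hat\zeta \le t, \zeta > t) = \nu_\gamma \times \mu$ for each $t \in \mathbb{R}_+$.
Hence, for the first part of the theorem it suffices to prove that there exists
a probability distribution $\lambda_F$ on
$(\mathbb{R} \times \mathbb{R} \times S, \mathcal{B}_{\mathbb{R}} \times \mathcal{B}_{\mathbb{R}} \times \mathcal{S})$ such that

\[
  \lambda_F((-\infty,t] \times (-\infty,u] \times A) = F(t,u,A) \qquad \forall (t,u,A) \in \mathbb{R} \times \mathbb{R} \times \mathcal{S}.
\]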

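To do this, we use Theorem 11.3 in Billingsley (1986). Define $\mathcal{H}$ by

\[
  \mathcal{H} = \{(a,b] \times (c,d] \times A;\ -\infty < a \le b < \infty,\ -\infty < c \le d < \infty,\ A \in \mathcal{S}\}.
\]

Clearly, $\mathcal{H}$ is a semiring generating $\mathcal{B}_{\mathbb{R}} \times \mathcal{B}_{\mathbb{R}} \times \mathcal{S}$. Define a set
function $\lambda_F : \mathcal{H} \to \mathbb{R}$ by

\[
\begin{aligned}
  \lambda_F((a,b] \times (c,d] \times A)
  &= F(b,d,A) - F(a,d,A) - F(b,c,A) + F(a,c,A) \\
  &= P(\zeta \in (a,b] \cap (c,d], V \in A)
     + \sigma(b)\big(e^{-\gamma[c-b]_+} - e^{-\gamma[d-b]_+}\big)\mu(A) \\
  &\quad - \sigma(a)\big(e^{-\gamma[c-a]_+} - e^{-\gamma[d-a]_+}\big)\mu(A)
  \qquad \forall (a,b] \times (c,d] \times A \in \mathcal{H}.
\end{aligned}
\]

Using the facts that $\sigma$ satisfies (3.1) and that $G$ is nondecreasing, it can be
shown that $\lambda_F$ is nonnegative. For example, if $a \le b \le c \le d$, we get

\[
\begin{aligned}
  \lambda_F((a,b] \times (c,d] \times A)
  &= \sigma(b)\big(e^{-\gamma(c-b)} - e^{-\gamma(d-b)}\big)\mu(A)
     - \sigma(a)\big(e^{-\gamma(c-a)} - e^{-\gamma(d-a)}\big)\mu(A) \\
  &= \big(e^{-\gamma c} - e^{-\gamma d}\big)\big(\sigma(b)e^{\gamma b} - \sigma(a)e^{\gamma a}\big)\mu(A) \ge 0,
\end{aligned}
\]

while if $a \le c \le b \le d$ we get

\[
\begin{aligned}
  \lambda_F((a,b] \times (c,d] \times A)
  &= P(\zeta \in (c,b], V \in A) + \sigma(b)\big(1 - e^{-\gamma(d-b)}\big)\mu(A)
     - \sigma(a)\big(e^{-\gamma(c-a)} - e^{-\gamma(d-a)}\big)\mu(A) \\
  &= P(\zeta \in (c,b], V \in A) + \sigma(b)e^{\gamma b}\big(e^{-\gamma b} - e^{-\gamma d}\big)\mu(A)
     - \sigma(a)e^{\gamma a}\big(e^{-\gamma c} - e^{-\gamma d}\big)\mu(A) \\
  &\ge P(\zeta \in (c,b], V \in A) - \sigma(a)e^{\gamma a}\big(e^{-\gamma c} - e^{-\gamma b}\big)\mu(A) \ge 0.
\end{aligned}
\]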

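We now show that $\lambda_F$ is countably additive on $\mathcal{H}$. In other words, we
assume that $(a,b] \times (c,d] \times A = \bigcup_{i=1}^{\infty} (a_i,b_i] \times (c_i,d_i] \times A_i$, where
$(a,b] \times (c,d] \times A \in \mathcal{H}$, $(a_i,b_i] \times (c_i,d_i] \times A_i \in \mathcal{H}$ for each
$i \in \mathbb{Z}'_+$, and the sets $\{(a_i,b_i] \times (c_i,d_i] \times A_i; i \in \mathbb{Z}'_+\}$ are disjoint,
and show that

\[
  \lambda_F((a,b] \times (c,d] \times A) = \sum_{i=1}^{\infty} \lambda_F((a_i,b_i] \times (c_i,d_i] \times A_i). \tag{3.3}
\]

Define $F_A : \mathbb{R} \times \mathbb{R} \to [0,1]$ by $F_A(t,u) = F(t,u,A)$ [where $A$ is the same set
as in (3.3)]. Define also the semiring $\mathcal{H}^*$ and the set function
$\lambda_{F_A} : \mathcal{H}^* \to \mathbb{R}$ by

\[
\begin{aligned}
  \mathcal{H}^* &= \{(a,b] \times (c,d];\ -\infty < a \le b < \infty,\ -\infty < c \le d < \infty\}; \\
  \lambda_{F_A}((a,b] \times (c,d]) &= \lambda_F((a,b] \times (c,d] \times A) \qquad \forall (a,b] \times (c,d] \in \mathcal{H}^*.
\end{aligned}
\]

Clearly, $F_A$ is continuous from above, and it was shown earlier that $\lambda_{F_A}$
is nonnegative. It therefore follows from Theorem 12.5 in Billingsley (1986)
that $\lambda_{F_A}$ can be uniquely extended to a measure on
$(\mathbb{R} \times \mathbb{R}, \mathcal{B}_{\mathbb{R}} \times \mathcal{B}_{\mathbb{R}})$, which in turn implies that $\lambda_{F_A} \times \mu$ is a
measure on $(\mathbb{R} \times \mathbb{R} \times S, \mathcal{B}_{\mathbb{R}} \times \mathcal{B}_{\mathbb{R}} \times \mathcal{S})$. Hence,

\[
  \lambda_{F_A}((a,b] \times (c,d])\,\mu(A) = \sum_{i=1}^{\infty} \lambda_{F_A}((a_i,b_i] \times (c_i,d_i])\,\mu(A_i),
\]

from which (3.3) will follow if we can show that

\[
  \sum_{i=1}^{\infty} P(\zeta \in (a_i,b_i] \cap (c_i,d_i], V \in A)\,\frac{\mu(A_i)}{\mu(A)}
  = \sum_{i=1}^{\infty} P(\zeta \in (a_i,b_i] \cap (c_i,d_i], V \in A_i).
\]

But this follows from the facts that

\[
  P(\zeta \in (a,b] \cap (c,d], V \in A)
  = \sum_{i=1}^{\infty} P(\zeta \in (a_i,b_i] \cap (c_i,d_i], V \in A)\,\frac{\mu(A_i)}{\mu(A)}
\]

and

\[
  P(\zeta \in (a,b] \cap (c,d], V \in A)
  = \sum_{i=1}^{\infty} P(\zeta \in (a_i,b_i] \cap (c_i,d_i], V \in A_i).
\]

This concludes the proof of the first part of the theorem.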

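We next show that if $\sigma$ is chosen so that equality holds in (3.1), then $G$ is
nondecreasing and right-continuous. Let $C \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S}$ and define, for
each $t \in \mathbb{R}_+$, $C^t = \{(x+t,y); (x,y) \in C\}$. It is easy to show that
$(\nu_\gamma \times \mu)(C^t) = e^{-\gamma t}(\nu_\gamma \times \mu)(C)$ for each $t \in \mathbb{R}_+$. Hence, for each
$0 \le s \le t < \infty$,

\[
  \frac{P((\zeta - t, V) \in C)}{(\nu_\gamma \times \mu)(C)}
  = \frac{P((\zeta - s, V) \in C^{t-s})\,e^{-\gamma(t-s)}}{(\nu_\gamma \times \mu)(C^{t-s})},
\]

implying that

\[
  G(t) = \inf_{\substack{C \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S} \\ (\nu_\gamma \times \mu)(C) > 0}}
  \frac{P((\zeta - t, V) \in C)\,e^{\gamma t}}{(\nu_\gamma \times \mu)(C)}
  = \inf_{\substack{C \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S} \\ (\nu_\gamma \times \mu)(C) > 0}}
  \frac{P((\zeta - s, V) \in C^{t-s})\,e^{\gamma s}}{(\nu_\gamma \times \mu)(C^{t-s})} \ge G(s),
\]

so $G$ is nondecreasing. Next, fix $t \in \mathbb{R}_+$ and choose a sequence
$\{C_k \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S}; k \in \mathbb{Z}'_+\}$ such that $(\nu_\gamma \times \mu)(C_k) > 0$ for each
$k \in \mathbb{Z}'_+$ and

\[
  \lim_{k \to \infty} \frac{P((\zeta - t, V) \in C_k)}{(\nu_\gamma \times \mu)(C_k)}
  = \inf_{\substack{C \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S} \\ (\nu_\gamma \times \mu)(C) > 0}}
  \frac{P((\zeta - t, V) \in C)}{(\nu_\gamma \times \mu)(C)}.
\]

For each $k \in \mathbb{Z}'_+$ and $u \in \mathbb{R}_+$, define $C_{k,u} = C_k \cap ((u,\infty) \times S)$ and
$C_{k,u}^{-u} = \{(x-u,y); (x,y) \in C_{k,u}\}$. Then, for each $k \in \mathbb{Z}'_+$ and each
$u \in \mathbb{R}_+$ such that $(\nu_\gamma \times \mu)(C_{k,u}) > 0$,

\[
\begin{aligned}
  G(t+u) &= \inf_{\substack{C \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S} \\ (\nu_\gamma \times \mu)(C) > 0}}
  \frac{P((\zeta - t - u, V) \in C)\,e^{\gamma(t+u)}}{(\nu_\gamma \times \mu)(C)} \\
  &\le \frac{P((\zeta - t - u, V) \in C_{k,u}^{-u})\,e^{\gamma(t+u)}}{(\nu_\gamma \times \mu)(C_{k,u}^{-u})}
  = \frac{P((\zeta - t, V) \in C_{k,u})\,e^{\gamma t}}{(\nu_\gamma \times \mu)(C_{k,u})}.
\end{aligned}
\]

This implies that $\limsup_{u \downarrow 0} G(t+u) \le G(t)$, and since $G$ is nondecreasing,
it must be right-continuous.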

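For the last part of the theorem, assume that a nonnegative random
variable $\hat\zeta$ can be defined on the same probability space as $(\zeta,V)$, such that
$P(\hat\zeta \le \zeta) = 1$ and $\mathcal{L}((\zeta - t, V) \mid \hat\zeta \le t, \zeta > t) = \nu_\gamma \times \mu$ for each
$t \in \mathbb{R}_+$. Then,

\[
\begin{aligned}
  P((\zeta - t, V) \in C) &= (\nu_\gamma \times \mu)(C)\,P(\hat\zeta \le t, \zeta > t) \\
  &\quad + P((\zeta - t, V) \in C \mid \hat\zeta > t, \zeta > t)\,P(\hat\zeta > t, \zeta > t)
  \qquad \forall (t,C) \in \mathbb{R}_+ \times (\mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S}),
\end{aligned}
\]

which implies that $\sigma : \mathbb{R}_+ \to [0,1]$, defined by $\sigma(t) = P(\hat\zeta \le t, \zeta > t)$,
satisfies (3.1). Moreover, (3.2) holds with $\sigma$ defined in this way, which implies
that if $a \le b \le c \le d$, then

\[
\begin{aligned}
  P(\hat\zeta \in (a,b], \zeta \in (c,d])
  &= \sigma(b)\big(e^{-\gamma(c-b)} - e^{-\gamma(d-b)}\big) - \sigma(a)\big(e^{-\gamma(c-a)} - e^{-\gamma(d-a)}\big) \\
  &= \big(e^{-\gamma c} - e^{-\gamma d}\big)\big(\sigma(b)e^{\gamma b} - \sigma(a)e^{\gamma a}\big) \ge 0,
\end{aligned}
\]

so $G$ is nondecreasing, and clearly also right-continuous. □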

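Theorem 3.2. Let $(\hat\zeta, \zeta, V)$ be a random variable taking values in
$(\mathbb{R}_+ \times \mathbb{R}_+ \times S, \mathcal{B}_{\mathbb{R}_+} \times \mathcal{B}_{\mathbb{R}_+} \times \mathcal{S})$, where $(S,\mathcal{S})$ is a measurable
space. Let $\mu$ be a probability measure on $(S,\mathcal{S})$. Define $\sigma : \mathbb{R}_+ \to [0,1]$ by
$\sigma(t) = P(\hat\zeta \le t, \zeta > t)$. If $P(\hat\zeta \le \zeta) = 1$ and
$\mathcal{L}((\zeta - t, V) \mid \hat\zeta \le t, \zeta > t) = \nu_\gamma \times \mu$ for each $t \in \mathbb{R}_+$,
then $\mathcal{L}((\hat\zeta, \zeta - \hat\zeta, V) \mid \hat\zeta < \zeta) = \mathcal{L}(\hat\zeta \mid \hat\zeta < \zeta) \times \nu_\gamma \times \mu$, where

\[
  P(\hat\zeta \le t, \hat\zeta < \zeta) = \sigma(t) + \gamma \int_0^t \sigma(x)\,dx \qquad \forall t \in \mathbb{R}_+
\]

and

\[
  P(\hat\zeta = \zeta \le t, V \in A) = P(\zeta \le t, V \in A) - \mu(A)\,\gamma \int_0^t \sigma(x)\,dx
  \qquad \forall (t,A) \in \mathbb{R}_+ \times \mathcal{S}.
\]

Conversely, if $P(\hat\zeta \le \zeta) = 1$ and
$\mathcal{L}((\hat\zeta, \zeta - \hat\zeta, V) \mid \hat\zeta < \zeta) = \mathcal{L}(\hat\zeta \mid \hat\zeta < \zeta) \times \nu_\gamma \times \mu$,
then $\mathcal{L}((\zeta - t, V) \mid \hat\zeta \le t, \zeta > t) = \nu_\gamma \times \mu$ for each $t \in \mathbb{R}_+$.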

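Proof. From (3.2), and using bounded convergence, we get

\[
\begin{aligned}
  &P(\hat\zeta \in (0,t],\ \zeta - \hat\zeta \in (0,u],\ V \in A) \\
  &\quad = \lim_{N \to \infty} \sum_{i=1}^{N}
     P\Big(\hat\zeta \in \Big(\tfrac{(i-1)t}{N}, \tfrac{it}{N}\Big],\
            \zeta \in \Big(\tfrac{it}{N}, \tfrac{it}{N} + u\Big],\ V \in A\Big) \\
  &\quad = \mu(A) \lim_{N \to \infty} \Big[ (1 - e^{-\gamma u}) \sum_{i=1}^{N} \sigma\Big(\tfrac{it}{N}\Big)
     - \big(e^{-\gamma(t/N)} - e^{-\gamma(t/N+u)}\big) \sum_{i=1}^{N} \sigma\Big(\tfrac{(i-1)t}{N}\Big) \Big] \\
  &\quad = \mu(A)(1 - e^{-\gamma u}) \lim_{N \to \infty} \Big[ \sum_{i=1}^{N}
     \Big(\sigma\Big(\tfrac{it}{N}\Big) - \sigma\Big(\tfrac{(i-1)t}{N}\Big)\Big)
     + \big(1 - e^{-\gamma t/N}\big) \sum_{i=1}^{N} \sigma\Big(\tfrac{(i-1)t}{N}\Big) \Big]
  \qquad \forall (t,u,A) \in \mathbb{R}_+ \times \mathbb{R}_+ \times \mathcal{S}.
\end{aligned}
\]

The first sum telescopes. For the second sum, we note that $\sigma$ is Riemann
integrable on $[0,t]$. This holds since the function $G : \mathbb{R}_+ \to \mathbb{R}_+$, defined by
$G(t) = \sigma(t)e^{\gamma t}$, is nondecreasing, hence Riemann-Stieltjes integrable on $[0,t]$
with respect to $\alpha : \mathbb{R}_+ \to [0,1]$, defined by $\alpha(t) = 1 - e^{-\gamma t}$; see Theorems 6.9
and 6.17 in Rudin (1976). This gives

\[
  P(\hat\zeta \in (0,t],\ \zeta - \hat\zeta \in (0,u],\ V \in A)
  = \mu(A)(1 - e^{-\gamma u}) \Big( \sigma(t) - \sigma(0) + \gamma \int_0^t \sigma(x)\,dx \Big)
  \qquad \forall (t,u,A) \in \mathbb{R}_+ \times \mathbb{R}_+ \times \mathcal{S}.
\]

To complete the proof of the first part of the theorem, note that
$P(\hat\zeta = 0, \zeta \in (0,u], V \in A) = \mu(A)(1 - e^{-\gamma u})\sigma(0)$,
and that $P(\hat\zeta = \zeta \le t, V \in A) = P(\zeta \le t, V \in A) - P(\zeta \le t, \hat\zeta < \zeta, V \in A)$. For
the second part of the theorem,

\[
\begin{aligned}
  P(\zeta - t \le u, V \in A, \hat\zeta \le t, \zeta > t)
  &= E\big(e^{\gamma \hat\zeta} I\{\hat\zeta \le t, \hat\zeta < \zeta\}\big)\big(e^{-\gamma t} - e^{-\gamma(t+u)}\big)\mu(A) \\
  &= P(\hat\zeta \le t, \zeta > t)(1 - e^{-\gamma u})\mu(A)
  \qquad \forall (t,u,A) \in \mathbb{R}_+ \times \mathbb{R}_+ \times \mathcal{S}. \quad □
\end{aligned}
\]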

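Theorem 3.3. Let the conditions of Theorem 3.1 hold with $S$ a finite
set, and let $f : \mathbb{R}_+ \times S \to \mathbb{R}_+$ be the Radon-Nikodym derivative with respect
to $\nu_\gamma \times \mu$ of the part of $\mathcal{L}(\zeta, V)$ which is absolutely continuous with respect
to $\nu_\gamma \times \mu$. Then,

\[
  \inf_{\substack{C \in \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S} \\ (\nu_\gamma \times \mu)(C) > 0}}
  \frac{P((\zeta - t, V) \in C)}{(\nu_\gamma \times \mu)(C)}
  = e^{-\gamma t} \operatorname*{ess\,inf}_{x \in (t,\infty) \times S} f(x) \qquad \forall t \in \mathbb{R}_+.
\]

Proof. The "$\ge$" part is easy. For the "$\le$" part, we use Theorem 35.8
in Billingsley (1986). For each $n \in \mathbb{Z}_+$, let $\mathcal{F}_n$ be the $\sigma$-algebra generated
by the sets $\{(k2^{-n}, (k+1)2^{-n}] \times \{s\}; k \in \mathbb{Z}_+, s \in S\}$. It is well known that
$\sigma(\bigcup_{n=0}^{\infty} \mathcal{F}_n) = \mathcal{B}_{\mathbb{R}'_+} \times \mathcal{S}$. Therefore, for $\nu_\gamma \times \mu$-almost every
$x \in \mathbb{R}_+ \times S$, $f(x)$ is the limit of ratios of the kind appearing on the left-hand
side. □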

Remark 3.1. The strong memoryless time $\hat\zeta$ for which equality holds
in (3.1) is optimal in the sense that $P(\hat\zeta \le t \mid \zeta > t)$ is maximized uniformly
over all $t \in \mathbb{R}_+$.

Remark 3.2. Theorems 3.1 and 3.2 imply that $\hat\zeta$ is a strong memoryless
time for $(\zeta, V)$ if and only if
$(\hat\zeta, \zeta - \hat\zeta, V) = \chi(\eta_0, \eta_1, V_1) + (1 - \chi)(\eta_2, 0, V_2)$,
where the random variables $\chi$, $\eta_0$, $\eta_1$, $V_1$ and $(\eta_2, V_2)$ are independent, $\chi$
takes values in $\{0,1\}$, $\eta_1$ is exponentially distributed with mean $\gamma^{-1}$ and
$\mathcal{L}(V_1) = \mu$. Clearly,
$\sigma(t) = P(\hat\zeta \le t, \zeta > t) = P(\chi = 1)\,E\big(e^{-\gamma(t - \eta_0)} I\{\eta_0 \le t\}\big)$.
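
The representation in Remark 3.2 lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions, not part of the paper: the mark $V$ is ignored, the laws of $\eta_0$ and $\eta_2$ are taken to be uniform purely for convenience, and the names `sample_pair` and `sigma_exact` are hypothetical. It compares a Monte Carlo estimate of $P(\hat\zeta \le t, \zeta > t)$ with the closed-form expression for $\sigma(t)$.

```python
import math
import random

def sample_pair(p_chi, gamma, rng):
    """Sample (zeta_hat, zeta) via zeta = chi*(eta0 + eta1) + (1 - chi)*eta2."""
    if rng.random() < p_chi:              # chi = 1: memory is lost at time eta0
        eta0 = rng.uniform(0.0, 1.0)      # hypothetical law for eta0
        eta1 = rng.expovariate(gamma)     # exponential with mean 1/gamma
        return eta0, eta0 + eta1
    eta2 = rng.uniform(0.0, 2.0)          # hypothetical law for eta2
    return eta2, eta2                     # chi = 0: zeta_hat equals zeta

def sigma_exact(t, p_chi, gamma):
    """P(chi = 1) * E[exp(-gamma (t - eta0)); eta0 <= t] for eta0 ~ U(0, 1)."""
    upper = min(t, 1.0)
    return p_chi * (math.exp(-gamma * (t - upper)) - math.exp(-gamma * t)) / gamma

rng = random.Random(2)
p_chi, gamma, t = 0.7, 2.0, 0.5
pairs = [sample_pair(p_chi, gamma, rng) for _ in range(100000)]
sigma_mc = sum(1 for zh, z in pairs if zh <= t < z) / len(pairs)
sigma_formula = sigma_exact(t, p_chi, gamma)
```

With these choices the two quantities should agree up to Monte Carlo error.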

Remark 3.3. Let $S = \{1\}$, and let $f$ be the Radon-Nikodym derivative
of $\mathcal{L}(\zeta)$ with respect to the exponential distribution with mean $\gamma^{-1}$.

1. Assume that $f(t) \ge \lim_{u \to \infty} f(u) = c > 0$ for all $t \in \mathbb{R}_+$. Then, the optimal
choice of $\sigma$ is $\sigma(t) = ce^{-\gamma t}$ which, by Theorem 3.2, implies that $P(\chi = 1) = c$
and $\eta_0 = 0$.
2. Assume that $f$ is nondecreasing. Then, the optimal choice of $\sigma$ is
$\sigma(t) = f(t)e^{-\gamma t}$ which, again by Theorem 3.2, implies that $\chi = 1$ and
$P(\eta_0 \le t) = f(t)e^{-\gamma t} + P(\zeta \le t)$ for each $t \in \mathbb{R}_+$.

Remark 3.4. The strong memoryless times were originally inspired by
another construction, the strong stationary times used in Aldous and Diaconis
(1986, 1987) and Diaconis and Fill (1990) to bound the rate of convergence
of a finite-state discrete-time Markov chain $\{\eta_i; i \in \mathbb{Z}_+\}$ to the stationary
distribution $\mu$. A strong stationary time $T$ is a randomized stopping
time such that $\mathcal{L}(\eta_i \mid T \le i) = \mu$ for each $i \in \mathbb{Z}_+$. It seems unlikely that strong
stationary times could be used (even in the restricted setting of discrete-time 
Markov chains) to solve the problem considered in the present paper, with- 
out significant modifications leading in the end to the construction of strong 
memoryless times. 

Strong memoryless times are also related to a construction due to Athreya 
and Ney (1978) and Nummelin (1978), known as splitting. This is an embed- 
ding of a discrete-time Markov chain on a general state space (satisfying an 
irreducibility condition) into another Markov chain on a larger state space 
which contains a recurrent single state. In general, this recurrent state need 
not be frequently observed, so splitting does not suffice (even in the discrete- 
time Markov chain setting) to solve the problem considered in the present 
paper. 

We end this section with lattice versions of the preceding theorems. The 
proofs are analogous to those above, but simpler, since right-continuity is 
trivial in the lattice case. 

Theorem 3.4. Let the conditions of Theorem 3.1 hold, with the following
changes: $\mathbb{R}_+$ is replaced by $\mathbb{Z}_+$, $\nu_\gamma$ is the geometric distribution with
mean $\gamma^{-1}$, and $e^{-\gamma}$ is replaced by $1 - \gamma$ in the definition of $G$ and in (3.2).
Then, all the assertions of Theorem 3.1 remain valid.

Theorem 3.5. Let the conditions of Theorem 3.2 hold, with the following
changes: $\mathbb{R}_+$ is replaced by $\mathbb{Z}_+$, and $\nu_\gamma$ is the geometric distribution
with mean $\gamma^{-1}$. Then, all the assertions of Theorem 3.2 remain valid, with
$\int_0^t \sigma(x)\,dx$ replaced by $\sum_{i=0}^{t-1} \sigma(i)$.

4. Markov renewal point processes. In this section we use the results in 
Section 3 to address the problem described in Section 1. Recall that we wish 
to find a bound for the total variation distance between the distribution of 
the number of points of an MRPP in (0,t] with states in B, and a suitable
compound Poisson distribution. We assume that the reader is familiar
with the basic theory of marked point processes. Good references are Rolski
(1981), Franken, König, Arndt and Schmidt (1982) and Port and Stone
(1973).

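We begin with the definition of an MRPP. Let $S = \{1, \ldots, N\}$, and let
$\{(\zeta_i^S, V_{i+1}^S); i \in \mathbb{Z}\}$ be a stationary discrete-time Markov chain taking values
in $(\mathbb{R}_+ \times S, \mathcal{B}_{\mathbb{R}_+ \times S})$, with a transition probability $p$ such that
$p((t,s),\cdot) = p(s,\cdot)$ for each $(t,s) \in \mathbb{R}_+ \times S$. Assume that $\{V_i^S; i \in \mathbb{Z}\}$ is
irreducible, and that $0 < E(\zeta_0^S) < \infty$. (We collectively denote these conditions
by C0.)

For each $A \subset S$, let $\{(\zeta_i^A, V_{i+1}^A); i \in \mathbb{Z}\}$ have the distribution
$\mathcal{L}((\zeta_i^S, V_{i+1}^S); i \in \mathbb{Z} \mid V_0^S \in A)$, and define $\{U_i^A; i \in \mathbb{Z}\}$ by $U_0^A = 0$,
$U_i^A = \sum_{j=0}^{i-1} \zeta_j^A$ for each $i \ge 1$, and $U_i^A = -\sum_{j=i}^{-1} \zeta_j^A$ for each
$i \le -1$. Define the point process $\Psi^A$ on $(\mathbb{R} \times S, \mathcal{B}_{\mathbb{R} \times S})$ by
$\Psi^A(\cdot) = \sum_{i \in \mathbb{Z}} I\{(U_i^A, V_i^A) \in \cdot\}$. $\Psi^A$ is a Palm version
(with respect to marks in $A$) of an MRPP.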

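Next, define the point process $\Psi$ on $(\mathbb{R} \times S, \mathcal{B}_{\mathbb{R} \times S})$ by

\[
  E(g(\Psi)) = \frac{E\big(\int_0^{U^A_{\tau_1^A}} g(\theta_t(\Psi^A))\,dt\big)}{E\big(U^A_{\tau_1^A}\big)}
  \qquad \forall g \in \mathcal{F}^+_{N(\mathbb{R} \times S)}, \tag{4.1}
\]

where $\mathcal{F}^+_{N(\mathbb{R} \times S)}$ are the nonnegative Borel functions on the space of counting
measures on $(\mathbb{R} \times S, \mathcal{B}_{\mathbb{R} \times S})$, $\tau_1^A = \min\{i \ge 1; V_i^A \in A\}$ and $\theta_t$ is the
shift operator, defined by $\theta_t(\Psi)((a,b] \times \cdot) = \Psi((a+t,b+t] \times \cdot)$. This
definition is independent of the choice of $A$, and $\Psi$ is a stationary marked point
process. There exist random variables $\{(U_i,V_i); i \in \mathbb{Z}\}$ (where
$\cdots < U_{-1} < U_0 \le 0 < U_1 < \cdots$) such that
$\Psi(\cdot) = \sum_{i \in \mathbb{Z}} I\{(U_i, V_i) \in \cdot\}$. $\Psi$ is a stationary MRPP.

The quantity that we are interested in can be expressed as $\Psi((0,t] \times B)$.
We assume without loss of generality that $B = S$, since otherwise we can
replace $\Psi$ by its restriction to $(\mathbb{R} \times B, \mathcal{B}_{\mathbb{R} \times B})$, which is also a stationary
MRPP.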

Analogously, we may define, using a stationary discrete-time Markov chain
$\{(\zeta_i^S, V_{i+1}^S); i \in \mathbb{Z}\}$ taking values in $(\mathbb{Z}_+ \times S, \mathcal{B}_{\mathbb{Z}_+ \times S})$, a stationary MRPP in
discrete time. In this case, for each $A \subset S$, the distribution of $\Psi$ is given by
a discrete version of (4.1), where the integral is replaced by a sum over the
integers $\{0, \ldots, U^A_{\tau_1^A} - 1\}$.

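We now explain how to use strong memoryless times to embed a stationary
MRPP into another stationary MRPP which has favorable properties
from the point of view of compound Poisson approximation. Consider a
stationary discrete-time Markov chain $\{(\zeta_i^S, V_{i+1}^S); i \in \mathbb{Z}\}$ on the state space
$(\mathbb{R}_+ \times S, \mathcal{B}_{\mathbb{R}_+ \times S})$ with transition probability $p$, satisfying condition C0.
Denote by $\nu_\gamma$ the exponential distribution with mean $\gamma^{-1}$, and let $\mu$ be a
probability measure on $(S, \mathcal{B}_S)$. For each $s \in S$, assume that
$\sigma_s : \mathbb{R}_+ \to [0,1]$ satisfies

\[
  \sigma_s(t) \le \inf_{\substack{C \in \mathcal{B}_{\mathbb{R}'_+ \times S} \\ (\nu_\gamma \times \mu)(C) > 0}}
  \frac{P((\zeta_0^S - t, V_1^S) \in C \mid V_0^S = s)}{(\nu_\gamma \times \mu)(C)} \qquad \forall t \in \mathbb{R}_+, \tag{4.2}
\]

and that $G_s : \mathbb{R}_+ \to \mathbb{R}_+$, defined by $G_s(t) = \sigma_s(t)e^{\gamma t}$, is nondecreasing and
right-continuous. Assume also that $\int_0^\infty \sigma_s(t)\,dt > 0$ for at least one $s \in S$.
(We collectively denote these conditions by C1.) Let $\tilde S = S \cup \{0\}$, and let
$\{(\tilde\zeta_i^{\tilde S}, \tilde V_{i+1}^{\tilde S}); i \in \mathbb{Z}\}$ be a stationary discrete-time Markov chain on the
state space $(\mathbb{R}_+ \times \tilde S, \mathcal{B}_{\mathbb{R}_+ \times \tilde S})$, with a transition probability $\tilde p$ defined for
each $(s,s') \in S \times S$ by

\[
\begin{aligned}
  \tilde p(s, [0,u] \times \{0\}) &= \sigma_s(u) + \gamma \int_0^u \sigma_s(t)\,dt, \\
  \tilde p(s, [0,u] \times \{s'\}) &= p(s, [0,u] \times \{s'\}) - \mu(s')\,\gamma \int_0^u \sigma_s(t)\,dt, \\
  \tilde p(0, [0,u] \times \{0\}) &= (1 - \varepsilon)\big(1 - e^{-(\gamma/\varepsilon)u}\big), \\
  \tilde p(0, [0,u] \times \{s'\}) &= \mu(s')\,\varepsilon\big(1 - e^{-(\gamma/\varepsilon)u}\big),
\end{aligned}
\]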

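where $\varepsilon \in (0,1)$. For each $A \subset \tilde S$, let
$\tilde\Psi^A(\cdot) = \sum_{i \in \mathbb{Z}} I\{(\tilde U_i^A, \tilde V_i^A) \in \cdot\}$ be the
Palm version (with respect to marks in $A$) of the MRPP associated with
$\{(\tilde\zeta_i^{\tilde S}, \tilde V_{i+1}^{\tilde S}); i \in \mathbb{Z}\}$, and let $\tilde\Psi^0 = \tilde\Psi^{\{0\}}$. $\tilde\Psi^A$ is a point
process on $(\mathbb{R} \times \tilde S, \mathcal{B}_{\mathbb{R} \times \tilde S})$. Heuristically, 0 is a frequently observed state
for $\tilde\Psi^A$ if $\varepsilon$ is small enough, and if the MRPP $\Psi^A$ loses its memory quickly
enough after each occurrence of a point. Let also
$\tilde\Psi(\cdot) = \sum_{i \in \mathbb{Z}} I\{(\tilde U_i, \tilde V_i) \in \cdot\}$ be the stationary MRPP
associated with $\{(\tilde\zeta_i^{\tilde S}, \tilde V_{i+1}^{\tilde S}); i \in \mathbb{Z}\}$.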
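
One consequence of the transition probabilities out of state 0 can be checked by simulation: each visit to 0 lasts an exponential time with mean $\varepsilon\gamma^{-1}$ and is followed by another visit to 0 with probability $1 - \varepsilon$, so the total time until a point with a state in $S$ appears is a geometric sum of exponentials, which is again exponential with mean $\gamma^{-1}$. The sketch below is an illustration only; the function name `time_to_leave_zero` and the parameter values are hypothetical.

```python
import random

def time_to_leave_zero(gamma, eps, rng):
    """Total time spent in state 0 before the chain moves to a state in S.

    From state 0 each step lasts an exponential time with mean eps/gamma and
    is followed by state 0 again with probability 1 - eps.
    """
    total = 0.0
    while True:
        total += rng.expovariate(gamma / eps)
        if rng.random() < eps:       # next point has a state in S (drawn from mu)
            return total

rng = random.Random(3)
gamma, eps = 2.0, 0.05
samples = [time_to_leave_zero(gamma, eps, rng) for _ in range(20000)]
mean_time = sum(samples) / len(samples)   # should be close to 1 / gamma = 0.5
```

This is the mechanism by which the additional state 0 preserves the exponential residual time that a strong memoryless time produces in the original process.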

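The following fact is now crucial, since it implies that we have constructed
an embedding: the restriction of $\tilde\Psi$ to $(\mathbb{R} \times S, \mathcal{B}_{\mathbb{R} \times S})$ has the same
distribution as $\Psi$. To see this, let $C_1, \ldots, C_k$ be disjoint subsets of $\mathbb{R} \times S$, let
$C_i^t = \{(x+t,y); (x,y) \in C_i\}$ for each $t \in \mathbb{R}_+$ and let $n_1, \ldots, n_k$ be nonnegative
integers. Applying (4.1) with $A = S$ gives

\[
  P\Big(\bigcap_{i=1}^{k} \{\tilde\Psi(C_i) = n_i\}\Big)
  = \frac{E\Big(\int_0^{\tilde U^S_{\tau_1^S}} I\Big\{\bigcap_{i=1}^{k} \{\tilde\Psi^S(C_i^t) = n_i\}\Big\}\,dt\Big)}
         {E\big(\tilde U^S_{\tau_1^S}\big)}.
\]

Clearly, we may replace $\tilde\Psi^S(\cdot)$ by
$\sum_{i \in \mathbb{Z}} I\{(\tilde U^S_{\tau_i^S}, \tilde V^S_{\tau_i^S}) \in \cdot\}$, where
$\cdots < \tau_{-1}^S < \tau_0^S = 0 < \tau_1^S < \cdots$ are the random integers
$\{i \in \mathbb{Z}; \tilde V_i^S \in S\}$. It is straightforward to show, using Theorems 3.1 and 3.2
and the strong Markov property, that the random sequence
$\{(\tilde U^S_{\tau_{i+1}^S} - \tilde U^S_{\tau_i^S}, \tilde V^S_{\tau_{i+1}^S}); i \in \mathbb{Z}\}$ is a stationary Markov
chain with transition probability $p$; that is, it has the same distribution as
$\{(\zeta_i^S, V_{i+1}^S); i \in \mathbb{Z}\}$. Hence, $\{(\tilde U^S_{\tau_i^S}, \tilde V^S_{\tau_i^S}); i \in \mathbb{Z}\}$ has the same
distribution as $\{(U_i^S, V_i^S); i \in \mathbb{Z}\}$, and since
$\Psi^S(\cdot) = \sum_{i \in \mathbb{Z}} I\{(U_i^S, V_i^S) \in \cdot\}$, the proof is complete.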

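Finally, we need the following tools. Define $\{(X_i^0, Y_i^0); i \in \mathbb{Z}\}$ by
$(X_i^0, Y_i^0) = (\tilde U^0_{\tau_i^0}, \tau_{i+1}^0 - \tau_i^0 - 1)$, where
$\cdots < \tau_{-1}^0 < \tau_0^0 = 0 < \tau_1^0 < \cdots$ are the random integers
$\{i \in \mathbb{Z}; \tilde V_i^0 = 0\}$. The strong Markov property implies that
$\{(X_{i+1}^0 - X_i^0, Y_i^0); i \in \mathbb{Z}\}$ is an i.i.d. sequence. Let
$\xi^0(\cdot) = \sum_{i \in \mathbb{Z}} I\{(X_i^0, Y_i^0) \in \cdot\}$ be a
point process on $(\mathbb{R} \times \mathbb{Z}_+, \mathcal{B}_{\mathbb{R} \times \mathbb{Z}_+})$. By definition, this is a Palm version
of a renewal reward process. Similarly, define $\{(X_i, Y_i); i \in \mathbb{Z}\}$ by
$(X_i, Y_i) = (\tilde U_{\tau_i}, \tau_{i+1} - \tau_i - 1)$, where
$\cdots < \tau_{-1} < \tau_0 \le 0 < \tau_1 < \cdots$ are the random integers
$\{i \in \mathbb{Z}; \tilde V_i = 0\}$, and let $\xi(\cdot) = \sum_{i \in \mathbb{Z}} I\{(X_i, Y_i) \in \cdot\}$ be a point
process on $(\mathbb{R} \times \mathbb{Z}_+, \mathcal{B}_{\mathbb{R} \times \mathbb{Z}_+})$. It is straightforward to show that $\xi$ is the
stationary renewal reward process corresponding to $\xi^0$.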

It is now easy to state and prove the main result of this section. It will be 
demonstrated in Section 5 that the bound given below can be expressed in 
terms of a small number of parameters obtained from the functions
$\{\sigma_s; s \in S\}$, by solving a small number of systems of linear equations.

Theorem 4.1. Let $\Psi$ be a stationary MRPP with state space $S = \{1, \ldots, N\}$,
satisfying condition C0 above. Let $\gamma > 0$, let $\mu$ be a probability measure on
$(S, \mathcal{B}_S)$ and assume that the functions $\{\sigma_s : \mathbb{R}_+ \to [0,1]; s \in S\}$ satisfy
condition C1 above. Then,
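\[
  d_{TV}(\mathcal{L}(\Psi((0,t] \times S)), \mathrm{POIS}(\pi))
  \le \frac{2E\big(\tilde U^0_{\tau_1^0 - 1}\big)}{E(X_1^0)}
  + H_1(\pi)\,\frac{3tE(Y_0^0)}{E(X_1^0)}
    \left( \frac{E(X_1^0 Y_0^0)}{E(X_1^0)} + \frac{E((X_1^0)^2)E(Y_0^0)}{E(X_1^0)^2} \right), \tag{4.3}
\]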

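where $\pi_i = \frac{t}{E(X_1^0)} P(Y_0^0 = i)$ for $i \ge 1$, and

\[
  H_1(\pi) \le
  \begin{cases}
    \big(\frac{1}{\pi_1} \wedge 1\big) e^{\|\pi\|}, & \text{always}, \\[4pt]
    1 \wedge \frac{1}{\pi_1 - 2\pi_2}\Big(\frac{1}{4(\pi_1 - 2\pi_2)} + \log^+\big(2(\pi_1 - 2\pi_2)\big)\Big),
      & \text{if } i\pi_i \ge (i+1)\pi_{i+1}\ \forall i \ge 1, \\[4pt]
    \frac{1}{(1 - 2\theta)\lambda}, & \text{if } \theta < \tfrac12,
  \end{cases}
\]

where $\lambda = \sum_{i=1}^{\infty} i\pi_i$ and $\theta = \frac{1}{\lambda} \sum_{i=1}^{\infty} i(i-1)\pi_i$.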



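Proof. The fact that $\mathcal{L}(\Psi((0,t] \times S)) = \mathcal{L}(\tilde\Psi((0,t] \times S))$ and the
triangle inequality imply that

\[
\begin{aligned}
  d_{TV}(\mathcal{L}(\Psi((0,t] \times S)), \mathrm{POIS}(\pi))
  &\le d_{TV}\Big(\mathcal{L}(\tilde\Psi((0,t] \times S)),\ \mathcal{L}\Big(\int_{(0,t] \times \mathbb{Z}'_+} v\,d\xi(u,v)\Big)\Big) \\
  &\quad + d_{TV}\Big(\mathcal{L}\Big(\int_{(0,t] \times \mathbb{Z}'_+} v\,d\xi(u,v)\Big),\ \mathrm{POIS}(\pi)\Big).
\end{aligned}
\]

For the first term on the right-hand side, the basic coupling inequality and
(4.1) give

\[
  d_{TV}\Big(\mathcal{L}(\tilde\Psi((0,t] \times S)),\ \mathcal{L}\Big(\int_{(0,t] \times \mathbb{Z}'_+} v\,d\xi(u,v)\Big)\Big)
  \le 2P(\tilde V_1 \in S) = \frac{2E\big(\tilde U^0_{\tau_1^0 - 1}\big)}{E(X_1^0)}.
\]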

For the second term, since $\xi$ is a stationary renewal reward process, Theorem
5.1 in Erhardsson (2000b) gives a bound which equals the second term
on the right-hand side in (4.3). The proof of Theorem 5.1 in Erhardsson
(2000b) uses the coupling version of Stein's method for compound Poisson
approximation. The last of the three bounds for the Stein constant $H_1(\pi)$ is
due to Barbour and Xia (1999). □
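
To make the role of the Stein constant concrete, the three upper bounds for $H_1(\pi)$ can be evaluated numerically for a measure $\pi$ with finite support on $\{1, 2, \ldots\}$. The following is a hedged sketch: the function name `stein_constant_bound` is hypothetical, the measure is passed as a list with `pi[i-1]` the mass at $i$, and the function simply returns the smallest of the bounds whose conditions are met.

```python
import math

def stein_constant_bound(pi):
    """Smallest applicable upper bound for H_1(pi); pi[i-1] is the mass at i."""
    norm = sum(pi)
    pi1 = pi[0]
    pi2 = pi[1] if len(pi) > 1 else 0.0
    # First bound: always valid.
    bounds = [(min(1.0 / pi1, 1.0) if pi1 > 0 else 1.0) * math.exp(norm)]
    # Second bound: requires i*pi_i >= (i+1)*pi_{i+1} for all i >= 1.
    padded = list(pi) + [0.0]
    monotone = all((i + 1) * padded[i] >= (i + 2) * padded[i + 1]
                   for i in range(len(pi)))
    d = pi1 - 2.0 * pi2
    if monotone and d > 0:
        log_plus = max(math.log(2.0 * d), 0.0)
        bounds.append(min(1.0, (1.0 / d) * (1.0 / (4.0 * d) + log_plus)))
    # Third bound (Barbour and Xia): requires theta < 1/2.
    lam = sum((i + 1) * m for i, m in enumerate(pi))
    theta = sum((i + 1) * i * m for i, m in enumerate(pi)) / lam
    if theta < 0.5:
        bounds.append(1.0 / ((1.0 - 2.0 * theta) * lam))
    return min(bounds)

poisson_case = stein_constant_bound([3.0])           # pi concentrated on 1
c0, norm = 0.9, 5.0                                  # geometric clump sizes
geometric_pi = [norm * c0 * (1.0 - c0) ** i for i in range(60)]
geometric_case = stein_constant_bound(geometric_pi)
```

For a measure concentrated on 1 the third bound reduces to $1/\|\pi\|$, and for the truncated geometric example it is the smallest of the three.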

We finally give, without proof, the lattice version of the preceding theo- 
rem. 

Theorem 4.2. Let the conditions of Theorem 4.1 hold, with the following
changes: $\mathbb{R}_+$ is replaced by $\mathbb{Z}_+$, $\nu_\gamma$ is the geometric distribution with mean
$\gamma^{-1}$, and $e^{-\gamma}$ is replaced by $1 - \gamma$ in the definition of $G_s$ for each $s \in S$. Then,
the bound (4.3) remains valid, with $E((X_1^0)^2)$ replaced by $E(X_1^0(X_1^0 - 1))$.

5. Application to renewal counts. The bound (4.3) does not at first sight
seem explicit. However, by using the Markov property and solving a small
number of systems of linear equations of dimension at most $N$, it is possible
to express all quantities appearing in (4.3) in terms of $\gamma$, $\mu$,
$\{E(\zeta_0^S I\{V_1^S = s'\} \mid V_0^S = s); (s,s') \in S \times S\}$,
$\{E((\zeta_0^S)^2 \mid V_0^S = s); s \in S\}$, $\{\int_0^\infty \sigma_s(t)\,dt; s \in S\}$
and $\{\int_0^\infty \int_u^\infty \sigma_s(t)\,dt\,du; s \in S\}$.

Below, we consider an important special case. We give a bound for the 
total variation distance between the distribution of the number of points in 
(0,t] of a stationary renewal process in continuous time and a compound 
Poisson distribution. 

By $\nu_\gamma$ we mean the exponential distribution with mean $\gamma^{-1}$.

Theorem 5.1. Let $\Psi$ be a stationary renewal point process on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$
with generic interrenewal time $\zeta$. Let $f$ be the Radon-Nikodym derivative
of the absolutely continuous part of $\mathcal{L}(\zeta)$ with respect to $\nu_\gamma$. Assume that
$\sigma : \mathbb{R}_+ \to [0,1]$ satisfies

\[
  \sigma(t) \le e^{-\gamma t} \inf_{x \in (t,\infty)} f(x) \qquad \forall t \in \mathbb{R}_+, \tag{5.1}
\]

and that $G : \mathbb{R}_+ \to [0,1]$, defined by $G(t) = \sigma(t)e^{\gamma t}$, is nondecreasing and
right-continuous; these conditions are satisfied if equality holds in (5.1). Let
$c_0 = \gamma \int_0^\infty \sigma(t)\,dt$ and $c_1 = \gamma \int_0^\infty \int_u^\infty \sigma(t)\,dt\,du$. Assume that
$c_0 > 0$. Then,

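\[
\begin{aligned}
  d_{TV}(\mathcal{L}(\Psi((0,t])), \mathrm{POIS}(\pi))
  &\le H_1(\pi)\,\frac{3t}{E(\zeta)} \left(
       \frac{E(\zeta) - \gamma^{-1}c_0 + E(\zeta) - c_1}{c_0 E(\zeta)}
     + \frac{E(\zeta^2) - 2\gamma^{-1}c_1}{E(\zeta)^2}
     + \frac{2(E(\zeta) - c_1)(E(\zeta) - \gamma^{-1}c_0)}{c_0 E(\zeta)^2} \right) \\
  &\quad + \frac{2(E(\zeta) - \gamma^{-1}c_0)}{E(\zeta)},
\end{aligned}
\]

where $\pi_i = \frac{t c_0}{E(\zeta)} (1 - c_0)^{i-1} c_0$ for $i \ge 1$, $\|\pi\| = t c_0 / E(\zeta)$, and

\[
  H_1(\pi) \le
  \begin{cases}
    \big(\frac{1}{\|\pi\| c_0} \wedge 1\big) \exp(\|\pi\|), & \text{if } c_0 \in (0,1], \\[4pt]
    \frac{1}{\|\pi\| c_0 (2c_0 - 1)} \Big( \frac{1}{4\|\pi\| c_0 (2c_0 - 1)}
      + \log^+\big(2\|\pi\| c_0 (2c_0 - 1)\big) \Big) \wedge 1, & \text{if } c_0 \in [\tfrac12, 1], \\[4pt]
    \frac{c_0^2}{\|\pi\| (5c_0 - 4)}, & \text{if } c_0 \in (\tfrac45, 1].
  \end{cases}
\]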

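Proof. We shall compute the bound (4.3) in the case $S = \{1\}$ for a fixed
$\varepsilon$, and let $\varepsilon \to 0$. All quantities appearing in (4.3) can be expressed in terms
of $\gamma$, $E(\zeta)$, $E(\zeta^2)$, $c_0$ and $c_1$, by solving a small number of systems of linear
equations. To do this, recall from Section 4 the definitions of the Markov
chain $\{(\tilde\zeta_i^{\tilde S}, \tilde V_{i+1}^{\tilde S}); i \in \mathbb{Z}\}$ and the random sequences
$\{(\tilde\zeta_i^0, \tilde V_{i+1}^0); i \in \mathbb{Z}\}$ and $\{(X_i^0, Y_i^0); i \in \mathbb{Z}\}$. Also, let
$\tau_1 = \min\{i \in \mathbb{Z}'_+; \tilde V_i^{\tilde S} = 0\}$ and $\tau_1^0 = \min\{i \in \mathbb{Z}'_+; \tilde V_i^0 = 0\}$.

1. Clearly, $P(Y_0^0 = k) = P(\tau_1^0 = k+1) = \varepsilon(1 - c_0)^{k-1} c_0$ for each
$k \in \mathbb{Z}'_+$. In particular, $E(Y_0^0) = \varepsilon / c_0$.

2. Define $h_0 : \{0,1\} \to \mathbb{R}_+$ by
$h_0(s) = E(\sum_{i=0}^{\tau_1 - 1} \tilde\zeta_i^{\tilde S} \mid \tilde V_0^{\tilde S} = s)$. Conditioning on
$(\tilde\zeta_0^{\tilde S}, \tilde V_1^{\tilde S})$ and using the Markov property, we see that
$E(X_1^0) = h_0(0) = \varepsilon\gamma^{-1} + \varepsilon h_0(1)$ and
$h_0(1) = E(\tilde\zeta_0^{\tilde S} \mid \tilde V_0^{\tilde S} = 1) + (1 - c_0) h_0(1)$. From the definition
of $\tilde p$ we see that $E(\tilde\zeta_0^{\tilde S} \mid \tilde V_0^{\tilde S} = 1) = E(\zeta) - \gamma^{-1} c_0$. It follows that
$h_0(1) = (E(\zeta) - \gamma^{-1} c_0)/c_0$ and $E(X_1^0) = \varepsilon E(\zeta)/c_0$.

3. Define $h_1 : \{0,1\} \to \mathbb{R}_+$ and $h_2 : \{0,1\} \to \mathbb{R}_+$ by
$h_1(s) = E(\sum_{i=0}^{\tau_1 - 1} (\tilde\zeta_i^{\tilde S})^2 \mid \tilde V_0^{\tilde S} = s)$ and
$h_2(s) = E(\sum_{i=0}^{\tau_1 - 1} \tilde\zeta_i^{\tilde S} \sum_{j=i+1}^{\tau_1 - 1} \tilde\zeta_j^{\tilde S} \mid \tilde V_0^{\tilde S} = s)$,
respectively. Again conditioning on $(\tilde\zeta_0^{\tilde S}, \tilde V_1^{\tilde S})$ and using the Markov
property, we see that

\[
\begin{aligned}
  E((X_1^0)^2) &= E\Big(\Big(\sum_{i=0}^{\tau_1 - 1} \tilde\zeta_i^{\tilde S}\Big)^2 \,\Big|\, \tilde V_0^{\tilde S} = 0\Big)
  = 2\varepsilon^2\gamma^{-2} + \varepsilon E\Big(\Big(\sum_{i=0}^{\tau_1 - 1} \tilde\zeta_i^{\tilde S}\Big)^2 \,\Big|\, \tilde V_0^{\tilde S} = 1\Big)
    + 2\varepsilon^2\gamma^{-1} h_0(1) \\
  &= 2\varepsilon^2\gamma^{-2} + \varepsilon h_1(1) + 2\varepsilon h_2(1) + 2\varepsilon^2\gamma^{-1} h_0(1).
\end{aligned}
\]

Also, $h_1(1) = E((\tilde\zeta_0^{\tilde S})^2 \mid \tilde V_0^{\tilde S} = 1) + (1 - c_0) h_1(1)$, and
$h_2(1) = E(\tilde\zeta_0^{\tilde S} I\{\tilde V_1^{\tilde S} = 1\} \mid \tilde V_0^{\tilde S} = 1)\,h_0(1) + (1 - c_0) h_2(1)$.
Again, from the definition of $\tilde p$ we see that
$E((\tilde\zeta_0^{\tilde S})^2 \mid \tilde V_0^{\tilde S} = 1) = E(\zeta^2) - 2\gamma^{-1} c_1$, and
$E(\tilde\zeta_0^{\tilde S} I\{\tilde V_1^{\tilde S} = 1\} \mid \tilde V_0^{\tilde S} = 1) = E(\zeta) - c_1$. It
follows that $h_1(1) = (E(\zeta^2) - 2\gamma^{-1} c_1)/c_0$ and
$h_2(1) = (E(\zeta) - c_1)(E(\zeta) - \gamma^{-1} c_0)/c_0^2$. Hence,

\[
  E((X_1^0)^2) = 2\varepsilon^2\gamma^{-2}
  + \frac{2(\varepsilon^2\gamma^{-1}E(\zeta) - \varepsilon^2\gamma^{-2}c_0)}{c_0}
  + \frac{\varepsilon E(\zeta^2) - 2\varepsilon\gamma^{-1}c_1}{c_0}
  + \frac{2(E(\zeta) - c_1)(\varepsilon E(\zeta) - \varepsilon\gamma^{-1}c_0)}{c_0^2}.
\]

4. Define $h_3 : \{0,1\} \to \mathbb{R}_+$ and $h_4 : \{0,1\} \to \mathbb{R}_+$ by
$h_3(s) = E(\sum_{i=0}^{\tau_1 - 1} \sum_{j=i}^{\tau_1 - 1} \tilde\zeta_j^{\tilde S} \mid \tilde V_0^{\tilde S} = s)$ and
$h_4(s) = E(\sum_{i=0}^{\tau_1 - 1} \tilde\zeta_i^{\tilde S} (\tau_1 - 1 - i) \mid \tilde V_0^{\tilde S} = s)$. Yet again
conditioning on $(\tilde\zeta_0^{\tilde S}, \tilde V_1^{\tilde S})$ and using the Markov property, we see that

\[
\begin{aligned}
  E(X_1^0 Y_0^0) &= E\Big(\sum_{i=0}^{\tau_1 - 1} \tilde\zeta_i^{\tilde S} (\tau_1 - 1) \,\Big|\, \tilde V_0^{\tilde S} = 0\Big)
  = \varepsilon^2\gamma^{-1} E(\tau_1 \mid \tilde V_0^{\tilde S} = 1)
    + \varepsilon E\Big(\tau_1 \sum_{i=0}^{\tau_1 - 1} \tilde\zeta_i^{\tilde S} \,\Big|\, \tilde V_0^{\tilde S} = 1\Big) \\
  &= \varepsilon^2\gamma^{-1} E(\tau_1 \mid \tilde V_0^{\tilde S} = 1) + \varepsilon h_3(1) + \varepsilon h_4(1).
\end{aligned}
\]

Likewise, $h_3(1) = h_0(1) + (1 - c_0) h_3(1)$, and
$h_4(1) = E(\tilde\zeta_0^{\tilde S} I\{\tilde V_1^{\tilde S} = 1\} \mid \tilde V_0^{\tilde S} = 1)\,E(\tau_1 \mid \tilde V_0^{\tilde S} = 1) + (1 - c_0) h_4(1)$.
It follows that $h_3(1) = (E(\zeta) - \gamma^{-1} c_0)/c_0^2$ and $h_4(1) = (E(\zeta) - c_1)/c_0^2$.
Hence,

\[
  E(X_1^0 Y_0^0) = \frac{\varepsilon^2\gamma^{-1}}{c_0}
  + \frac{\varepsilon E(\zeta) - \varepsilon\gamma^{-1}c_0}{c_0^2}
  + \frac{\varepsilon E(\zeta) - \varepsilon c_1}{c_0^2}.
\]

5. It holds that
$E(\tilde U^0_{\tau_1^0 - 1}) = E(\sum_{i=0}^{\tau_1 - 2} \tilde\zeta_i^{\tilde S} \mid \tilde V_0^{\tilde S} = 0)
\le E(\tilde\zeta_0^{\tilde S} I\{\tilde V_1^{\tilde S} = 1\} \mid \tilde V_0^{\tilde S} = 0) + \varepsilon h_0(1)
= \varepsilon^2\gamma^{-1} + (\varepsilon E(\zeta) - \varepsilon\gamma^{-1}c_0)/c_0$.

We finally let $\varepsilon \to 0$ in (4.3). □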
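
The linear fixed-point equations in steps 1 to 4 are easy to solve mechanically. The sketch below is an illustration only: the function name `cycle_moments` and its argument names are hypothetical, and the relation $h_3(1) = h_0(1) + (1 - c_0)h_3(1)$ is solved as $h_3(1) = h_0(1)/c_0$. It returns the four moments of one renewal cycle for $S = \{1\}$, and the example evaluates them in the Poisson case $c_0 = 1$, $c_1 = \gamma^{-1}$.

```python
def cycle_moments(gamma, eps, c0, c1, m1, m2):
    """Moments of one renewal cycle for S = {1}; m1 = E(zeta), m2 = E(zeta^2)."""
    h0 = (m1 - c0 / gamma) / c0            # h0 = E(step | state 1) + (1 - c0) h0
    h1 = (m2 - 2.0 * c1 / gamma) / c0      # h1 = E(step^2 | state 1) + (1 - c0) h1
    h2 = (m1 - c1) * h0 / c0               # h2 = (m1 - c1) h0 + (1 - c0) h2
    h3 = h0 / c0                           # h3 = h0 + (1 - c0) h3
    h4 = (m1 - c1) / c0 ** 2               # h4 = (m1 - c1) / c0 + (1 - c0) h4
    EY = eps / c0
    EX = eps / gamma + eps * h0
    EX2 = (2.0 * eps ** 2 / gamma ** 2 + eps * h1 + 2.0 * eps * h2
           + 2.0 * eps ** 2 * h0 / gamma)
    EXY = eps ** 2 / (gamma * c0) + eps * (h3 + h4)
    return {"EY": EY, "EX": EX, "EX2": EX2, "EXY": EXY}

# Poisson case: zeta exponential with mean 1/gamma, so c0 = 1 and c1 = 1/gamma.
gamma, eps = 2.0, 0.1
m = cycle_moments(gamma, eps, c0=1.0, c1=0.5, m1=0.5, m2=0.5)
```

In the Poisson case all of $h_0(1), \ldots, h_4(1)$ vanish, and in general `EX` coincides with $\varepsilon E(\zeta)/c_0$ from step 2.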

Remark 5.1. In order to clarify what is needed to make the bound in
Theorem 5.1 small, recall from Remark 3.2 the representation
$\zeta = \chi(\eta_0 + \eta_1) + (1 - \chi)\eta_2$, where the random variables $\chi$, $\eta_0$, $\eta_1$ and
$\eta_2$ are independent, $\chi$ takes values in $\{0,1\}$ and $\eta_1$ is exponentially
distributed with mean $\gamma^{-1}$. It is easy to see that $P(\chi = 1) = c_0$ and
$E(\chi(\eta_0 + \eta_1)) = c_1$, implying that
$E(\zeta) - \gamma^{-1}c_0 = c_0 E(\eta_0) + (1 - c_0)E(\eta_2)$, $E(\zeta) - c_1 = (1 - c_0)E(\eta_2)$ and
$E(\zeta^2) - 2\gamma^{-1}c_1 = c_0 E(\eta_0^2) + (1 - c_0)E(\eta_2^2)$.

As a consequence, assume that $c_0 \ge c > 0$ and $0 < a \le t/E(\zeta) \le b < \infty$ (if
$c > \tfrac45$, the second condition is not needed). Then, the bound in Theorem 5.1
is bounded above and below by a positive constant times the expression

\[
  \max\left( \frac{E(\eta_0)}{E(\zeta)},\ \frac{E(\eta_0^2)}{E(\zeta)^2},\
             \frac{(1 - c_0)E(\eta_2)}{E(\zeta)},\ \frac{(1 - c_0)E(\eta_2^2)}{E(\zeta)^2} \right).
\]

Remark 5.2. The bound given in Theorem 5.1 simplifies further if $\mathcal{L}(\zeta)$
has a Radon-Nikodym derivative $f$ with respect to $\nu_\gamma$ for some $\gamma > 0$, and
$\inf_{x \in (t,\infty)} f(x) = c > 0$ for each $t \in \mathbb{R}_+$. It is then clear that we may choose
$c_0 = c$ and $c_1 = \gamma^{-1}c$.

For example, assume that $\mathcal{L}(\zeta)$ is DFR (decreasing failure rate), and the
failure rate has a strictly positive limit $\gamma > 0$. It then follows from Remark
4.9 in Brown (1983) that $f(x)$ decreases monotonically as $x \to \infty$ to a limit
$c \ge 0$. If $c > 0$, we are in the case just described.
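
When $\sigma$ is not available in closed form, the constants $c_0$ and $c_1$ can be approximated by numerical integration. The sketch below is a hedged illustration: the function name `constants_from_sigma`, the truncation point `upper` and the step count are hypothetical choices, and the double integral defining $c_1$ is rewritten as $\gamma \int_0^\infty t\sigma(t)\,dt$ by Fubini's theorem. The example uses $\sigma(t) = ce^{-\gamma t}$, for which $c_0 = c$ and $c_1 = \gamma^{-1}c$.

```python
import math

def constants_from_sigma(sigma, gamma, upper=60.0, steps=200000):
    """Approximate c0 = gamma * int sigma(t) dt and c1 = gamma * int t*sigma(t) dt.

    The midpoint rule on [0, upper] is used; `upper` truncates both integrals,
    so sigma is assumed to be negligible beyond it.
    """
    h = upper / steps
    c0 = c1 = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        s = sigma(t)
        c0 += s * h
        c1 += t * s * h
    return gamma * c0, gamma * c1

gamma, c = 2.0, 0.8
c0, c1 = constants_from_sigma(lambda t: c * math.exp(-gamma * t), gamma)
```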

Remark 5.3. Assume that $\Psi$ is a Poisson process, that is, that $\mathcal{L}(\zeta) = \nu_\gamma$
for some $\gamma > 0$. Then, from Remark 5.2, $c_0 = 1$ and $c_1 = \gamma^{-1}$, so the
bound given in Theorem 5.1 is 0. The approximating distribution POIS($\pi$)
is Po($t\gamma$).